[ENH] Addition of Set representation and support-aware plotting#983

Open
Khushmagrawal wants to merge 1 commit into sktime:main from Khushmagrawal:symbolic-set-api

@Khushmagrawal
Contributor

Reference Issues/PRs

Fixes #244

What does this implement/fix? Explain your changes.

I have implemented a hierarchy of set classes to represent the mathematical support of distributions: RealSet, IntegerSet, IntervalSet, FiniteSet, EmptySet, UnionSet, and IntersectionSet. These sets are designed to be tabular, meaning a single set object can store its parameters (bounds or values) as arrays.
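To illustrate what "tabular" means here, a minimal sketch of an interval set whose bounds are stored as arrays. The class name `IntervalSetSketch` and its methods are hypothetical and only illustrate the idea; they are not the classes or signatures added in this PR.

```python
import numpy as np


class IntervalSetSketch:
    """Hypothetical tabular interval set: one closed interval per cell."""

    def __init__(self, lower, upper):
        # Broadcast both bounds to a common shape so scalars and arrays can mix.
        self.lower, self.upper = np.broadcast_arrays(
            np.asarray(lower, dtype=float), np.asarray(upper, dtype=float)
        )

    @property
    def shape(self):
        return self.lower.shape

    def contains(self, x):
        """Elementwise membership test, broadcasting x against the bounds."""
        x = np.asarray(x, dtype=float)
        return (x >= self.lower) & (x <= self.upper)

    def boundary(self):
        """Return the (lower, upper) bound arrays (assumed return format)."""
        return self.lower, self.upper


# One object holding three intervals: [0, 1], [0, 2], [0, 3].
intervals = IntervalSetSketch(lower=0.0, upper=[1.0, 2.0, 3.0])
inside = intervals.contains(1.5)  # one boolean per interval
```

A single scalar query is tested against all three intervals at once, which is the behaviour a tabular set would need to line up with array distributions.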

BaseDistribution now includes a support property, which allows distributions to communicate programmatically where their probability mass or density resides. So far, the property is overridden in only a few distributions, chosen to exercise the different set types: Binomial (IntegerSet), Poisson (IntegerSet), Empirical (FiniteSet), TruncatedDistribution (IntersectionSet), and ZeroInflated (UnionSet).
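A rough sketch of the override pattern described above, in plain Python. All class names ending in `Sketch` are hypothetical stand-ins, and the default of "whole real line" on the base class is an assumption made for illustration, not something confirmed by the PR.

```python
class RealSetSketch:
    """Hypothetical stand-in for the set of all real numbers."""

    def contains(self, x):
        return True


class IntegerSetSketch:
    """Hypothetical set of integers in [lower, upper]; None means unbounded."""

    def __init__(self, lower=None, upper=None):
        self.lower = lower
        self.upper = upper

    def contains(self, x):
        if x != int(x):  # reject non-integer values
            return False
        if self.lower is not None and x < self.lower:
            return False
        if self.upper is not None and x > self.upper:
            return False
        return True


class BaseDistributionSketch:
    @property
    def support(self):
        # Assumed default: the whole real line unless a subclass overrides it.
        return RealSetSketch()


class BinomialSketch(BaseDistributionSketch):
    def __init__(self, n, p):
        self.n = n
        self.p = p

    @property
    def support(self):
        # A binomial with n trials places mass on the integers 0, 1, ..., n.
        return IntegerSetSketch(lower=0, upper=self.n)


binom_support = BinomialSketch(n=5, p=0.3).support
```

Callers can then ask the distribution where its mass lives without knowing the concrete distribution type.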

The plot functionality has been changed to use support.boundary() to determine exact x-axis limits. For discrete and mixed distributions, the plotter now queries the support for FiniteSet or IntegerSet components to identify exact coordinates for stem plots.
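As an illustration of how axis limits and stem coordinates could be derived from a support, here is a small standalone sketch. The helper `plot_layout` and the tuple encoding of support components are hypothetical and only illustrate the idea; the PR itself uses `support.boundary()` and the set classes, and unbounded supports would need a separate truncation rule that is not shown here.

```python
def plot_layout(components, pad=0.05):
    """Return (xlim, stem_x) for a bounded support.

    components is a list of ("interval", lo, hi) or ("finite", values) tuples,
    a hypothetical encoding used only in this sketch.
    """
    lows, highs, stems = [], [], []
    for kind, *params in components:
        if kind == "interval":
            lo, hi = params
        elif kind == "finite":
            (values,) = params
            stems.extend(values)  # atoms are drawn as stems at exact points
            lo, hi = min(values), max(values)
        else:
            raise ValueError(f"unknown component kind: {kind!r}")
        lows.append(lo)
        highs.append(hi)
    lo, hi = min(lows), max(highs)
    margin = pad * (hi - lo)  # small visual margin around the support
    return (lo - margin, hi + margin), sorted(set(stems))


# Support {0} union [0, 10]: one stem at 0, limits padded by 10 percent.
xlim, stem_x = plot_layout([("finite", [0]), ("interval", 0.0, 10.0)], pad=0.1)
```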

Does your contribution introduce a new dependency? If yes, which one?

No new dependencies are introduced.

What should a reviewer concentrate their feedback on?

  • _plot_single: for mixed distributions (like ZeroInflated), the pdf plot currently draws the continuous part as a line and the discrete mass as stems. Mathematically, the PDF of a mixed distribution includes Dirac delta spikes (infinite density). To distinguish mass from density, we mask the PDF line at the spike points using np.isclose. Is that the right approach?
  • Broadcasting logic in _get_bc_params: we currently use oned_as="col" to force 1D parameter arrays into column vectors of shape (N, 1).
  • In UnionSet and IntersectionSet, we allow a mix of scalar and tabular children, and currently force the output to match the parent set's ndim.
  • Are the current Set types sufficient?
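For reference on the first point, a minimal sketch of the masking idea. The function `mask_density_at_atoms` and its tolerance defaults are hypothetical and are not the code in `_plot_single`. It relies on the fact that matplotlib breaks a line wherever the y-value is NaN.

```python
import numpy as np


def mask_density_at_atoms(x, density, atoms, rtol=1e-9, atol=1e-12):
    """Return a copy of density with NaN wherever x coincides with an atom."""
    x = np.asarray(x, dtype=float)
    masked = np.array(density, dtype=float)  # copy, input is left untouched
    atoms = np.atleast_1d(np.asarray(atoms, dtype=float))
    # Compare every grid point with every atom: shape (len(x), len(atoms)).
    hits = np.isclose(x[:, None], atoms[None, :], rtol=rtol, atol=atol).any(axis=1)
    masked[hits] = np.nan  # NaN creates a gap in the plotted line
    return masked


x = np.linspace(-1.0, 1.0, 5)  # grid: -1, -0.5, 0, 0.5, 1
density = np.ones_like(x)
masked = mask_density_at_atoms(x, density, atoms=[0.0])
```

One design point to consider: the default `atol` of `np.isclose` is 1e-8, so on very fine grids near zero more than one grid point could be masked. Passing explicit tolerances, as in the sketch, makes that choice visible.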

Did you add any tests for the change?

Yes, test_set.py has been added to verify the symbolic set logic.

Any other comments?

Includes a minor fix for ZeroInflated._iloc so that scalar-scalar indexing correctly delegates to _iat. This was necessary for cell-level plotting in array distributions.

PR checklist

For all contributions
  • I've added myself to the list of contributors with any new badges I've earned :-)
    How to: add yourself to the all-contributors file in the skpro root directory (not the CONTRIBUTORS.md). Common badges: code - fixing a bug, or adding code logic. doc - writing or improving documentation or docstrings. bug - reporting or diagnosing a bug (get this plus code if you also fixed the bug in the PR). maintenance - CI, test framework, release.
    See here for full badge reference
  • The PR title starts with either [ENH], [MNT], [DOC], or [BUG]. [BUG] - bugfix, [MNT] - CI, test framework, [ENH] - adding or improving code, [DOC] - writing or improving documentation or docstrings.

@Khushmagrawal
Contributor Author

Adding two plotting examples here as well. In the first, the plot shows the continuous density and the point mass at zero separately, and the support is {0} union R.

zin

For zero-inflated Poisson, the support is {0} union the inner discrete support, and the plot shows the added mass at zero.

zip

@Khushmagrawal changed the title from "[ENH] Addition of Set representation and support-aware plotting." to "[ENH] Addition of Set representation and support-aware plotting" on Mar 20, 2026

Development

Successfully merging this pull request may close these issues.

[ENH] inspectable set-valued domains for distributions